
    Natural Compression for Distributed Deep Learning

    Modern deep learning models are often trained in parallel over a collection of distributed machines to reduce training time. In such settings, communication of model updates among machines becomes a significant performance bottleneck, and various lossy update compression techniques have been proposed to alleviate this problem. In this work, we introduce a new, simple, yet theoretically and practically effective compression technique: natural compression (NC). Our technique is applied individually to all entries of the to-be-compressed update vector and works by randomized rounding to the nearest (negative or positive) power of two, which can be computed in a "natural" way by ignoring the mantissa. We show that, compared to no compression, NC increases the second moment of the compressed vector by no more than the tiny factor 9/8, which means that the effect of NC on the convergence speed of popular training algorithms, such as distributed SGD, is negligible. However, the communication savings enabled by NC are substantial, leading to a 3-4x improvement in overall theoretical running time. For applications requiring more aggressive compression, we generalize NC to natural dithering, which we prove is exponentially better than the common random dithering technique. Our compression operators can be used on their own or in combination with existing operators for a more aggressive combined effect, and offer a new state of the art both in theory and practice. Comment: 8 pages, 20 pages of Appendix, 6 Tables, 14 Figures
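    The rounding rule the abstract describes can be sketched as follows (the function name and details below are our illustration of unbiased randomized rounding to a power of two, not the paper's exact operator):

    ```python
    import math
    import random

    def natural_compress(t: float) -> float:
        """Randomized rounding of t to the nearest (signed) power of two."""
        if t == 0.0:
            return 0.0
        sign = math.copysign(1.0, t)
        a = abs(t)
        k = math.floor(math.log2(a))
        low, high = 2.0 ** k, 2.0 ** (k + 1)  # powers of two bracketing |t|
        # Rounding up with probability (a - low) / (high - low) makes the
        # operator unbiased: E[C(t)] = t.
        p_up = (a - low) / (high - low)
        return sign * (high if random.random() < p_up else low)
    ```

    Since the output is always a signed power of two, only the sign and exponent need to be communicated, which is where the bandwidth savings come from.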

    LASR-Guided Stellar Photometric Variability Subtraction: The Linear Algorithm For Significance Reduction

    We develop a technique for removing stellar variability in the light curves of δ-Scuti and similar stars. Our technique, which we name the Linear Algorithm for Significance Reduction (LASR), subtracts oscillations from a time series by minimizing their statistical significance in frequency space. We demonstrate that LASR can subtract variable signals of near-arbitrary complexity and can robustly handle close frequency pairs and overtone frequencies. We demonstrate that our algorithm achieves a fit equivalent to prewhitening on the straightforward variable signal of KIC 9700322. We also show that LASR provides a better fit to seismic activity than prewhitening in the case of the complex δ-Scuti KOI-976. Comment: 9 pages, 5 figures, accepted for publication in Astronomy & Astrophysics. Pseudocode and github link to code included in manuscript
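    A single fit-and-subtract step of this kind can be sketched as a linear least-squares fit of a sinusoid at a candidate frequency (an illustration of the general idea, not the published LASR code):

    ```python
    import numpy as np

    def subtract_sinusoid(t, y, freq):
        """Fit A*sin(2*pi*f*t) + B*cos(2*pi*f*t) + C by least squares and
        subtract it, driving the periodogram power at `freq` toward zero."""
        A = np.column_stack([
            np.sin(2.0 * np.pi * freq * t),
            np.cos(2.0 * np.pi * freq * t),
            np.ones_like(t),
        ])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        return y - A @ coeffs
    ```

    Because the sine/cosine pair spans any phase at that frequency, the fit is linear in its coefficients; repeating the step over a list of frequencies removes a multi-mode signal one component at a time.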

    Maestro: Uncovering Low-Rank Structures via Trainable Decomposition

    Deep Neural Networks (DNNs) have been a large driver and enabler for AI breakthroughs in recent years. These models have been getting larger in their attempt to become more accurate and tackle new upcoming use-cases, including AR/VR and intelligent assistants. However, the training of such large models is costly and time-consuming, and typically yields a single model to fit all targets. To mitigate this, various techniques have been proposed in the literature, including pruning, sparsification or quantization of the model weights and updates. While able to achieve high compression rates, they often incur computational overheads or accuracy penalties. Alternatively, factorization methods have been leveraged to incorporate low-rank compression in the training process. However, such techniques (e.g., SVD) frequently rely on the computationally expensive decomposition of layers and are potentially sub-optimal for non-linear models, such as DNNs. In this work, we take a further step in designing efficient low-rank models and propose Maestro, a framework for trainable low-rank layers. Instead of regularly applying a priori decompositions such as SVD, the low-rank structure is built into the training process through a generalized variant of Ordered Dropout. This method imposes an importance ordering via sampling on the decomposed DNN structure. Our theoretical analysis demonstrates that our method recovers the SVD decomposition of a linear mapping on uniformly distributed data and PCA for linear autoencoders. We further apply our technique to DNNs and empirically illustrate that Maestro enables the extraction of lower-footprint models that preserve model performance while allowing for a graceful accuracy-latency tradeoff for deployment to devices of different capabilities. Comment: Under review
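    The core structure, a factorized layer whose rank-one components carry an importance ordering so the least important ones can be dropped at deployment, can be sketched as follows (shapes and names are our illustration, not Maestro's actual API):

    ```python
    import numpy as np

    class LowRankLinear:
        """Linear layer parameterized as W = U @ V; truncating to the first r
        components mimics an ordered-dropout-style importance ordering."""

        def __init__(self, d_in, d_out, rank, rng):
            self.U = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
            self.V = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)

        def forward(self, x, keep_rank=None):
            r = keep_rank if keep_rank is not None else self.U.shape[1]
            # Use only the first r rank-one components of the factorization.
            return self.U[:, :r] @ (self.V[:r, :] @ x)
    ```

    During training, sampling `keep_rank` at random (smaller ranks more often for the leading components) is what pushes the most useful directions to the front, so a single trained model yields a family of smaller models by truncation.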

    Improving Performance of Private Federated Models in Medical Image Analysis

    Federated learning (FL) is a distributed machine learning (ML) approach that allows models to be trained without centralizing the data. This approach is particularly beneficial for medical applications because it addresses some key challenges associated with medical data, such as privacy, security, and data ownership. On top of that, FL can improve the quality of ML models used in medical applications. Medical data is often diverse and can vary significantly depending on the patient population, making it challenging to develop ML models that are accurate and generalizable. FL allows medical data to be used from multiple sources, which can help to improve the quality and generalizability of ML models. Differential privacy (DP) is a go-to algorithmic tool to make this process secure and private. In this work, we show that the model performance can be further improved by employing local steps, a popular approach to improving the communication efficiency of FL, and by tuning the number of communication rounds. Concretely, for a given privacy budget, we derive the optimal number of local steps and communication rounds. We provide theoretical motivations, further corroborated with experimental evaluations on real-world medical imaging tasks.
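    The interplay of local steps and private aggregation can be sketched as one round of DP-FedAvg on a least-squares objective (the loss, clipping threshold, and noise scale below are our illustrative choices, not the paper's exact recipe):

    ```python
    import numpy as np

    def dp_fedavg_round(w, clients, local_steps, lr, clip, noise_std, rng):
        """One communication round: each client runs `local_steps` local SGD
        steps, clips its model update, adds Gaussian noise, and the server
        averages the noisy updates."""
        updates = []
        for X, y in clients:
            w_local = w.copy()
            for _ in range(local_steps):
                grad = X.T @ (X @ w_local - y) / len(y)  # least-squares gradient
                w_local -= lr * grad
            delta = w_local - w
            delta *= min(1.0, clip / max(np.linalg.norm(delta), 1e-12))  # clip
            updates.append(delta + rng.normal(0.0, noise_std, size=delta.shape))
        return w + np.mean(updates, axis=0)
    ```

    With noise injected once per round rather than once per gradient step, taking more local steps per round changes how far a fixed privacy budget stretches, which is the trade-off the paper tunes.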

    Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition

    Federated Learning (FL) is a promising research paradigm that enables the collaborative training of machine learning models among various parties without the need for sensitive information exchange. Nonetheless, retaining data in individual clients introduces fundamental challenges to achieving performance on par with centrally trained models. Our study provides an extensive review of federated learning applied to visual recognition. It underscores the critical role of thoughtful architectural design choices in achieving optimal performance, a factor often neglected in the FL literature. Many existing FL solutions are tested on shallow or simple networks, which may not accurately reflect real-world applications. This practice restricts the transferability of research findings to large-scale visual recognition models. Through an in-depth analysis of diverse cutting-edge architectures such as convolutional neural networks, transformers, and MLP-mixers, we experimentally demonstrate that architectural choices can substantially enhance FL systems' performance, particularly when handling heterogeneous data. We study 19 visual recognition models from five different architectural families on four challenging FL datasets. We also re-investigate the inferior performance of convolution-based architectures in the FL setting and analyze the influence of normalization layers on FL performance. Our findings emphasize the importance of architectural design for computer vision tasks in practical scenarios, effectively narrowing the performance gap between federated and centralized learning. Our source code is available at https://github.com/sarapieri/fed_het.git. Comment: to be published in NeurIPS 202

    Transcranial direct current stimulation (tDCS) modulation of picture naming and word reading: A meta-analysis of single session tDCS applied to healthy participants

    Recent reviews quantifying the effects of single sessions of transcranial direct current stimulation (tDCS) in healthy volunteers find only minor effects on cognition despite the popularity of this technique. Here, we wanted to quantify the effects of tDCS on language production tasks that measure word reading and picture naming. We reviewed 14 papers measuring tDCS effects across a total of 96 conditions to a) quantify effects of conventional stimulation on language regions (i.e., left hemisphere anodal tDCS administered to temporal/frontal areas) under normal conditions or under conditions of cognitive (semantic) interference; and b) identify parameters which may moderate the size of the tDCS effect within conventional stimulation protocols (e.g., online vs. offline, high vs. low current densities, and short vs. long durations), as well as within types of stimulation not typically explored by previous reviews (i.e., right hemisphere anodal tDCS or left/right hemisphere cathodal tDCS). In all analyses there was no significant effect of tDCS, but we did find a small but significant effect of the timing and duration of stimulation, with stronger effects for offline stimulation and for shorter durations (< 15 min). We also found some indication of publication bias towards reporting positive effects. We encourage further experimentation in order to resolve the disparity between the current popularity of tDCS and its poor efficacy in healthy participants.

    Modern Electronic Techniques Applied to Physics and Engineering

    Contains reports on three research projects

    Methodology for the nocturnal cardiac arrhythmia ancillary study of the ADVENT-HF trial in patients with heart failure with reduced ejection fraction and sleep-disordered breathing

    Background Sleep-disordered breathing (SDB) may trigger nocturnal cardiac arrhythmias (NCA) in patients with heart failure with reduced ejection fraction (HFrEF). The NCA ancillary study of the ADVENT-HF trial will test whether, in HFrEF patients with SDB, peak-flow-triggered adaptive servo-ventilation (ASVpf) reduces NCA. To this end, accurate scoring of NCA from polysomnography (PSG) is required. Objective To develop a method to detect NCA accurately from a single-lead electrocardiogram (ECG) recorded during PSG and to assess inter-observer agreement for NCA detection. Methods Quality assurance of ECG analysis included training of the investigators, development of standardized technical quality, guideline-conforming semi-automated NCA scoring via Holter-ECG software, and implementation of an arrhythmia adjudication committee. To assess inter-observer agreement, the ECG was analysed by two independent investigators and compared for agreement on premature ventricular complexes per hour (PVC/h) and premature atrial complexes per hour (PAC/h), as well as on other NCA, in 62 patients from two centers of the ADVENT-HF trial. Results The intraclass correlation coefficients for PVC/h and PAC/h were excellent: 0.99 (95% confidence interval [CI]: 0.99-0.99) and 0.99 (95% CI: 0.97-0.99), respectively. No clinically relevant difference in inter-observer classification of other NCA was found. The detection of non-sustained ventricular tachycardia (18% versus 19%) and atrial fibrillation (10% versus 11%) was similar between the two investigators. No sustained ventricular tachycardia was detected. Conclusion These findings indicate that our methods are very reliable for scoring NCAs and are adequate to apply to the entire PSG data set of the ADVENT-HF trial.
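    Inter-observer agreement of this kind is commonly quantified with a two-way intraclass correlation coefficient; a textbook ICC(2,1) (two-way random effects, absolute agreement) can be sketched as follows (a standard formula for illustration, not the study's actual analysis code):

    ```python
    import numpy as np

    def icc2_1(ratings):
        """ICC(2,1) for an n_subjects x n_raters matrix of scores."""
        n, k = ratings.shape
        mean_subj = ratings.mean(axis=1, keepdims=True)
        mean_rater = ratings.mean(axis=0, keepdims=True)
        grand = ratings.mean()
        ss_total = ((ratings - grand) ** 2).sum()
        ss_rows = k * ((mean_subj - grand) ** 2).sum()   # between subjects
        ss_cols = n * ((mean_rater - grand) ** 2).sum()  # between raters
        ss_err = ss_total - ss_rows - ss_cols
        msr = ss_rows / (n - 1)                # mean square, rows
        msc = ss_cols / (k - 1)                # mean square, columns
        mse = ss_err / ((n - 1) * (k - 1))     # mean square, error
        return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    ```

    Applied to the PVC/h counts of the two investigators across the 62 patients, a value near 1 indicates the near-perfect agreement reported above.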